This report summarizes the results of the bootstrapping change-point approach to select the diagnostic opportunity window for Blastomycosis. For this report we analyzed opportunity windows ranging from 10 to 100 days prior to the index diagnosis.
The following approach was used for this report. For a given opportunity bound, OB (e.g., 50 days before diagnosis), do the following:

1. Generate 100 bootstrapped samples by selecting individual patients with replacement. For each sample, compute the number of visits (any visit or SSD-related visit) on each day before diagnosis.
2. For a given bootstrap sample and opportunity bound, OB, perform the following:
   1. Estimate Expected Trends - Estimate linear, quadratic, and cubic trends over the control window, from 365 to OB days before diagnosis.
   2. Extrapolate each trend forward into the implied opportunity window, from OB to 1 day before diagnosis.
   3. Estimate Excess Trends - Compute the residuals during the opportunity window (these are the “excess” visits) as the difference between the observed values and the expected trend extrapolated in the previous sub-step. Use a LOESS model to fit this excess trend.
   4. Compute the final fitted trend as the sum of the expected and excess trends.
3. Using the final estimated trend, compute in-sample and out-of-sample performance, defined as follows:
   - In-sample - Compare the observed and fitted values within the bootstrapped sample used to fit the model.
   - Aggregate Out-of-sample - Compare the observed values from the aggregated (non-bootstrapped) patient data to the fitted values from the bootstrap samples.
   - K-fold (99-fold) Out-of-sample - Compare the observed values from each of the other 99 bootstrapped samples to the fitted values from a given bootstrap sample.
4. Repeat steps 2-3 across all bootstrap samples and across the range of opportunity bounds from 10 to 100 days before diagnosis.
5. Aggregate performance metrics across bootstrap samples.
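The procedure above can be sketched in Python for a single opportunity bound and trend. This is a minimal illustration under stated assumptions, not the code used for this report: all function names are hypothetical, the patients are synthetic, `loess_smooth` is a simplified local linear smoother standing in for a full LOESS fit, day OB is assigned to the control window, MSE is taken over all 365 days, and only 20 bootstrap samples are drawn to keep the example fast.

```python
import numpy as np
from numpy.polynomial import Polynomial

N_DAYS = 365  # look-back period before diagnosis, as in the report


def daily_counts(visit_days, n_days=N_DAYS):
    """Number of visits on each day 1..n_days before diagnosis."""
    counts = np.bincount(visit_days, minlength=n_days + 1)
    return counts[1:n_days + 1].astype(float)


def bootstrap_counts(patient_visits, rng):
    """Resample patients with replacement, then recount daily visits."""
    idx = rng.integers(0, len(patient_visits), size=len(patient_visits))
    return daily_counts(np.concatenate([patient_visits[i] for i in idx]))


def loess_smooth(x, y, frac=0.75):
    """Local linear fit with tricube weights (no robustness iterations)."""
    k = min(len(x), max(3, int(np.ceil(frac * len(x)))))
    fitted = np.empty(len(x))
    for i, x0 in enumerate(x):
        dist = np.abs(x - x0)
        h = np.sort(dist)[k - 1]  # distance to the k-th nearest point
        w = np.clip(1 - (dist / h) ** 3, 0, None) ** 3
        # polyfit multiplies residuals by its weights, so pass sqrt(w)
        line = np.polyfit(x, y, deg=1, w=np.sqrt(w))
        fitted[i] = np.polyval(line, x0)
    return fitted


def fit_trend(counts, bound, degree, frac=0.75):
    """Expected polynomial trend plus smoothed excess before the bound."""
    days = np.arange(1, len(counts) + 1)
    control = days >= bound  # assumption: day OB is part of the control window
    trend = Polynomial.fit(days[control], counts[control], deg=degree)
    expected = trend(days)   # extrapolated into the opportunity window
    opp = ~control
    excess = np.zeros(len(counts))
    excess[opp] = loess_smooth(days[opp].astype(float),
                               (counts - expected)[opp], frac)
    return expected + excess


def mse(observed, fitted):
    return float(np.mean((observed - fitted) ** 2))


# Synthetic patients: background visits all year plus extra visits
# in the 40 days before diagnosis (made-up values for illustration).
rng = np.random.default_rng(0)
patients = [np.concatenate([rng.integers(1, N_DAYS + 1, size=rng.poisson(5)),
                            rng.integers(1, 41, size=rng.poisson(2))])
            for _ in range(200)]
observed_all = daily_counts(np.concatenate(patients))

n_boot = 20  # the report uses 100
boot = [bootstrap_counts(patients, rng) for _ in range(n_boot)]
fits = [fit_trend(c, bound=50, degree=3) for c in boot]

in_sample = np.mean([mse(c, f) for c, f in zip(boot, fits)])
aggregate = np.mean([mse(observed_all, f) for f in fits])
kfold = np.mean([mse(boot[j], fits[i])
                 for i in range(n_boot) for j in range(n_boot) if i != j])
print(f"in-sample={in_sample:.2f} aggregate={aggregate:.2f} k-fold={kfold:.2f}")
```

Repeating the last block over bounds 10 to 100 and degrees 1 to 3 would give one averaged MSE per specification and metric, which is the quantity ranked in the tables that follow.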
The following figure depicts the number of patients with any visit or an SSD-related visit each day prior to diagnosis:
This section summarizes results using counts of SSD-related visits.
The following figure depicts the in-sample and out-of-sample performance (MSE) of various bounds on the opportunity window and different trends.
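The following table lists the top 10 specifications, ranked separately by aggregate (left three columns) and 99-fold (right three columns) out-of-sample MSE:
| Rank | Aggregate: Bound (Days) | Aggregate: Model | Aggregate: MSE | 99-fold: Bound (Days) | 99-fold: Model | 99-fold: MSE |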
|---|---|---|---|---|---|---|
| 1 | 50 | Cubic | 114.01 | 50 | Cubic | 195.73 |
| 2 | 51 | Cubic | 114.40 | 65 | Quadratic | 195.93 |
| 3 | 65 | Quadratic | 114.68 | 51 | Cubic | 196.10 |
| 4 | 49 | Cubic | 114.70 | 57 | Cubic | 196.43 |
| 5 | 57 | Cubic | 114.90 | 65 | Cubic | 196.48 |
| 6 | 54 | Cubic | 114.97 | 66 | Quadratic | 196.52 |
| 7 | 53 | Cubic | 115.07 | 49 | Cubic | 196.56 |
| 8 | 52 | Cubic | 115.09 | 58 | Cubic | 196.74 |
| 9 | 48 | Cubic | 115.09 | 54 | Cubic | 196.77 |
| 10 | 65 | Cubic | 115.13 | 59 | Cubic | 196.78 |
The following figure depicts the observed and expected trend for the top 4 models based on aggregate out-of-sample performance:
The following figure depicts the observed and expected trend for the top 4 models based on 99-fold out-of-sample performance:
The following table depicts the 10 best models for each trend, based on aggregate out-of-sample performance:
| Rank | Linear: Bound | Linear: MSE | Quadratic: Bound | Quadratic: MSE | Cubic: Bound | Cubic: MSE |
|---|---|---|---|---|---|---|
| 1 | 96 | 129.86 | 65 | 114.68 | 50 | 114.01 |
| 2 | 95 | 130.51 | 66 | 115.28 | 51 | 114.40 |
| 3 | 92 | 130.63 | 57 | 115.44 | 49 | 114.70 |
| 4 | 98 | 130.64 | 67 | 115.56 | 57 | 114.90 |
| 5 | 97 | 130.73 | 58 | 115.68 | 54 | 114.97 |
| 6 | 94 | 130.95 | 59 | 115.75 | 53 | 115.07 |
| 7 | 93 | 130.98 | 64 | 115.77 | 52 | 115.09 |
| 8 | 99 | 131.12 | 63 | 115.86 | 48 | 115.09 |
| 9 | 100 | 131.50 | 68 | 115.97 | 65 | 115.13 |
| 10 | 91 | 131.72 | 62 | 115.99 | 55 | 115.14 |
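Any specification in the overall top 10 must also be in the top 10 for its own trend, so the overall aggregate ranking for SSD-related visits can be rebuilt from the per-trend table above. The sketch below does this as a consistency check. It assumes the three column pairs are linear, quadratic, and cubic in that order, and that ties in the rounded MSE keep their table order; the variable names are illustrative.

```python
# (bound, MSE) pairs transcribed from the per-trend table above
per_trend = {
    "Linear": [(96, 129.86), (95, 130.51), (92, 130.63), (98, 130.64),
               (97, 130.73), (94, 130.95), (93, 130.98), (99, 131.12),
               (100, 131.50), (91, 131.72)],
    "Quadratic": [(65, 114.68), (66, 115.28), (57, 115.44), (67, 115.56),
                  (58, 115.68), (59, 115.75), (64, 115.77), (63, 115.86),
                  (68, 115.97), (62, 115.99)],
    "Cubic": [(50, 114.01), (51, 114.40), (49, 114.70), (57, 114.90),
              (54, 114.97), (53, 115.07), (52, 115.09), (48, 115.09),
              (65, 115.13), (55, 115.14)],
}

# Pool all specifications, then sort by MSE only. Python's sort is
# stable, so equal MSE values keep the order they had in the table.
pooled = [(bound, model, mse)
          for model, rows in per_trend.items()
          for bound, mse in rows]
overall_top10 = sorted(pooled, key=lambda spec: spec[2])[:10]

# Aggregate columns of the overall top-10 table for SSD-related visits
reported_top10 = [
    (50, "Cubic", 114.01), (51, "Cubic", 114.40), (65, "Quadratic", 114.68),
    (49, "Cubic", 114.70), (57, "Cubic", 114.90), (54, "Cubic", 114.97),
    (53, "Cubic", 115.07), (52, "Cubic", 115.09), (48, "Cubic", 115.09),
    (65, "Cubic", 115.13),
]

for rank, (bound, model, mse) in enumerate(overall_top10, start=1):
    print(f"{rank:2d}  {bound:3d}  {model:<9s}  {mse:.2f}")
print("matches reported table:", overall_top10 == reported_top10)
```

The same check applies to the 99-fold tables and to the all-visits results by substituting the corresponding values.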
The following table depicts the 10 best models for each trend, based on 99-fold out-of-sample performance:
| Rank | Linear: Bound | Linear: MSE | Quadratic: Bound | Quadratic: MSE | Cubic: Bound | Cubic: MSE |
|---|---|---|---|---|---|---|
| 1 | 96 | 209.91 | 65 | 195.93 | 50 | 195.73 |
| 2 | 92 | 210.49 | 66 | 196.52 | 51 | 196.10 |
| 3 | 95 | 210.54 | 67 | 196.81 | 57 | 196.43 |
| 4 | 98 | 210.71 | 57 | 196.91 | 65 | 196.48 |
| 5 | 97 | 210.81 | 64 | 197.09 | 49 | 196.56 |
| 6 | 93 | 210.85 | 58 | 197.19 | 58 | 196.74 |
| 7 | 94 | 210.86 | 63 | 197.20 | 54 | 196.77 |
| 8 | 99 | 211.31 | 59 | 197.22 | 59 | 196.78 |
| 9 | 91 | 211.42 | 68 | 197.27 | 55 | 196.79 |
| 10 | 100 | 211.59 | 62 | 197.35 | 56 | 196.83 |
The following figure depicts the top 4 performing linear models based on aggregate out-of-sample MSE:
The following figure depicts the top 4 performing linear models based on 99-fold out-of-sample MSE:
The following figure depicts the top 4 performing quadratic models based on aggregate out-of-sample MSE:
The following figure depicts the top 4 performing quadratic models based on 99-fold out-of-sample MSE:
The following figure depicts the top 4 performing cubic models based on aggregate out-of-sample MSE:
The following figure depicts the top 4 performing cubic models based on 99-fold out-of-sample MSE:
This section summarizes results using counts of all visits.
The following figure depicts the in-sample and out-of-sample performance (MSE) of various bounds on the opportunity window and different trends.
The following table lists the top 10 specifications, ranked separately by aggregate (left three columns) and 99-fold (right three columns) out-of-sample MSE:
| Rank | Aggregate: Bound (Days) | Aggregate: Model | Aggregate: MSE | 99-fold: Bound (Days) | 99-fold: Model | 99-fold: MSE |
|---|---|---|---|---|---|---|
| 1 | 44 | Cubic | 389.49 | 44 | Cubic | 643.81 |
| 2 | 57 | Quadratic | 390.43 | 57 | Quadratic | 645.05 |
| 3 | 43 | Cubic | 390.66 | 43 | Cubic | 645.08 |
| 4 | 57 | Cubic | 390.73 | 58 | Quadratic | 645.41 |
| 5 | 58 | Quadratic | 390.81 | 57 | Cubic | 645.46 |
| 6 | 58 | Cubic | 391.38 | 56 | Quadratic | 645.99 |
| 7 | 56 | Cubic | 391.51 | 56 | Cubic | 646.07 |
| 8 | 56 | Quadratic | 391.58 | 58 | Cubic | 646.07 |
| 9 | 45 | Cubic | 393.37 | 45 | Cubic | 647.39 |
| 10 | 59 | Quadratic | 393.92 | 59 | Quadratic | 648.09 |
The following figure depicts the observed and expected trend for the top 4 models based on aggregate out-of-sample performance:
The following figure depicts the observed and expected trend for the top 4 models based on k-fold out-of-sample performance:
The following table depicts the 10 best models for each trend, based on aggregate out-of-sample performance:
| Rank | Linear: Bound | Linear: MSE | Quadratic: Bound | Quadratic: MSE | Cubic: Bound | Cubic: MSE |
|---|---|---|---|---|---|---|
| 1 | 91 | 431.11 | 57 | 390.43 | 44 | 389.49 |
| 2 | 90 | 432.59 | 58 | 390.81 | 43 | 390.66 |
| 3 | 96 | 432.86 | 56 | 391.58 | 57 | 390.73 |
| 4 | 97 | 433.02 | 59 | 393.92 | 58 | 391.38 |
| 5 | 92 | 433.51 | 55 | 396.89 | 56 | 391.51 |
| 6 | 98 | 433.76 | 60 | 396.89 | 45 | 393.37 |
| 7 | 93 | 434.01 | 61 | 397.12 | 59 | 394.46 |
| 8 | 95 | 434.09 | 65 | 397.18 | 50 | 394.48 |
| 9 | 94 | 434.83 | 62 | 397.70 | 51 | 394.58 |
| 10 | 99 | 435.11 | 63 | 397.89 | 55 | 395.31 |
The following table depicts the 10 best models for each trend, based on 99-fold out-of-sample performance:
| Rank | Linear: Bound | Linear: MSE | Quadratic: Bound | Quadratic: MSE | Cubic: Bound | Cubic: MSE |
|---|---|---|---|---|---|---|
| 1 | 91 | 687.74 | 57 | 645.05 | 44 | 643.81 |
| 2 | 90 | 689.26 | 58 | 645.41 | 43 | 645.08 |
| 3 | 96 | 689.86 | 56 | 645.99 | 57 | 645.46 |
| 4 | 97 | 689.86 | 59 | 648.09 | 56 | 646.07 |
| 5 | 98 | 690.24 | 60 | 650.81 | 58 | 646.07 |
| 6 | 92 | 690.56 | 61 | 651.08 | 45 | 647.39 |
| 7 | 93 | 690.93 | 55 | 651.24 | 59 | 648.74 |
| 8 | 95 | 691.24 | 65 | 651.34 | 50 | 648.81 |
| 9 | 99 | 691.50 | 44 | 651.57 | 51 | 648.93 |
| 10 | 100 | 691.80 | 62 | 651.59 | 54 | 649.73 |
The following figure depicts the top 4 performing linear models based on out-of-sample MSE:
The following figure depicts the top 4 performing quadratic models based on out-of-sample MSE:
The following figure depicts the top 4 performing cubic models based on out-of-sample MSE: